Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 381109 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 57.5 MiB |
| Average record size in memory | 158.2 B |
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 6 |
Warnings
| age_damage_premium is highly skewed (γ1 = 168.2687719) | Skewed |
|---|---|
| id is uniformly distributed | Uniform |
| id has unique values | Unique |
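The γ1 quoted in the skewness warning is a moment-based skewness coefficient. As a rough illustration, the sketch below computes the plain Fisher-Pearson coefficient g1 = m3 / m2^1.5 in pure Python. The report's own estimator may apply a small-sample correction, and the function name and toy data here are hypothetical.

```python
def skewness_g1(values):
    """Fisher-Pearson skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]  # one extreme value stretches the right tail

print(skewness_g1(symmetric))     # zero for symmetric data
print(skewness_g1(right_tailed))  # positive for a long right tail
```

A value far above 1, such as the one reported for age_damage_premium, usually points to a handful of extreme values dominating the third moment.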
Reproduction
| Analysis started | 2021-02-13 20:42:06.192227 |
|---|---|
| Analysis finished | 2021-02-13 20:42:45.894614 |
| Duration | 39.7 seconds |
| Software version | pandas-profiling v2.10.0 |
| Download configuration | config.yaml |
id
Real number (ℝ≥0)
| Distinct | 381109 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 190555 |
|---|---|
| Minimum | 1 |
| Maximum | 381109 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 19056.4 |
| Q1 | 95278 |
| median | 190555 |
| Q3 | 285832 |
| 95-th percentile | 362053.6 |
| Maximum | 381109 |
| Range | 381108 |
| Interquartile range (IQR) | 190554 |
Descriptive statistics
| Standard deviation | 110016.8362 |
|---|---|
| Coefficient of variation (CV) | 0.5773495117 |
| Kurtosis | -1.2 |
| Mean | 190555 |
| Median Absolute Deviation (MAD) | 95277 |
| Skewness | 9.443273511 × 10^16 |
| Sum | 7.26222255 × 10^10 |
| Variance | 1.210370425 × 10^10 |
| Monotonicity | Strictly increasing |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2049 | 1 | < 0.1% |
| 99738 | 1 | < 0.1% |
| 19875 | 1 | < 0.1% |
| 17826 | 1 | < 0.1% |
| 23969 | 1 | < 0.1% |
| 21920 | 1 | < 0.1% |
| 109983 | 1 | < 0.1% |
| 107934 | 1 | < 0.1% |
| 114077 | 1 | < 0.1% |
| 112028 | 1 | < 0.1% |
| Other values (381099) | 381099 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 381109 | 1 | < 0.1% |
| 381108 | 1 | < 0.1% |
| 381107 | 1 | < 0.1% |
| 381106 | 1 | < 0.1% |
| 381105 | 1 | < 0.1% |
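The id statistics are what one would expect if id simply enumerates the rows as 1, 2, ..., N. The following sketch, which assumes exactly that and uses the closed forms for consecutive integers, reproduces the reported sum, mean, standard deviation, and coefficient of variation. Variable names are illustrative.

```python
import math

N = 381_109  # number of observations reported in the dataset statistics

total = N * (N + 1) // 2     # sum of 1..N
mean = total / N             # equals (N + 1) / 2
variance = N * (N + 1) / 12  # sample variance (ddof=1) of 1..N
std = math.sqrt(variance)
cv = std / mean              # coefficient of variation

print(total)     # compare with Sum
print(mean)      # compare with Mean
print(variance)  # compare with Variance
print(std)       # compare with Standard deviation
print(cv)        # compare with CV; approaches 1/sqrt(3) for large N
```

Because the column carries no information beyond row order, it is usually excluded before modelling.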
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206089 | 54.1% |
| 1 | 175020 | 45.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206089 | 54.1% |
| 1 | 175020 | 45.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206089 | 54.1% |
| 1 | 175020 | 45.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206089 | 54.1% |
| 1 | 175020 | 45.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206089 | 54.1% |
| 1 | 175020 | 45.9% |
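Each Frequency (%) value in these tables is the count divided by the number of observations. A minimal sketch of that arithmetic for the gender counts follows; the helper name is hypothetical.

```python
N_OBS = 381_109  # number of observations in the dataset

gender_counts = {"0": 206_089, "1": 175_020}


def frequency_pct(count, total=N_OBS):
    """Share of `count` in `total`, as a percentage."""
    return 100.0 * count / total


shares = {value: round(frequency_pct(c), 1) for value, c in gender_counts.items()}
print(shares)
```

The same helper applies to any row of any frequency table in the report, since all variables have zero missing values and therefore share the same denominator.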
age
Real number (ℝ≥0)
| Distinct | 66 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.82258357 |
|---|---|
| Minimum | 20 |
| Maximum | 85 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 25 |
| median | 36 |
| Q3 | 49 |
| 95-th percentile | 69 |
| Maximum | 85 |
| Range | 65 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 15.51161102 |
|---|---|
| Coefficient of variation (CV) | 0.3995512301 |
| Kurtosis | -0.5656550665 |
| Mean | 38.82258357 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.6725389977 |
| Sum | 14795636 |
| Variance | 240.6100764 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 24 | 25960 | 6.8% |
| 23 | 24256 | 6.4% |
| 22 | 20964 | 5.5% |
| 25 | 20636 | 5.4% |
| 21 | 16457 | 4.3% |
| 26 | 13535 | 3.6% |
| 27 | 10760 | 2.8% |
| 28 | 8974 | 2.4% |
| 43 | 8437 | 2.2% |
| 44 | 8357 | 2.2% |
| Other values (56) | 222773 | 58.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 6232 | 1.6% |
| 21 | 16457 | 4.3% |
| 22 | 20964 | 5.5% |
| 23 | 24256 | 6.4% |
| 24 | 25960 | 6.8% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 85 | 11 | < 0.1% |
| 84 | 11 | < 0.1% |
| 83 | 22 | < 0.1% |
| 82 | 29 | < 0.1% |
| 81 | 56 | < 0.1% |
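Several of the summary figures for age follow directly from others in the same tables. The sketch below recomputes the range, interquartile range, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The dictionary layout is illustrative only.

```python
# Values copied from the age tables in this report.
age = {
    "min": 20,
    "max": 85,
    "q1": 25,
    "q3": 49,
    "mean": 38.82258357,
    "std": 15.51161102,
}

value_range = age["max"] - age["min"]  # Range
iqr = age["q3"] - age["q1"]            # Interquartile range
cv = age["std"] / age["mean"]          # Coefficient of variation
variance = age["std"] ** 2             # Variance

print(value_range, iqr, cv, variance)
```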
region_code
Real number (ℝ≥0)
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.3888074 |
|---|---|
| Minimum | 0 |
| Maximum | 52 |
| Zeros | 2021 |
| Zeros (%) | 0.5% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 15 |
| median | 28 |
| Q3 | 35 |
| 95-th percentile | 47 |
| Maximum | 52 |
| Range | 52 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 13.22988803 |
|---|---|
| Coefficient of variation (CV) | 0.5013446733 |
| Kurtosis | -0.8678571198 |
| Mean | 26.3888074 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | -0.1152664149 |
| Sum | 10057012 |
| Variance | 175.0299372 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 28 | 106415 | 27.9% |
| 8 | 33877 | 8.9% |
| 46 | 19749 | 5.2% |
| 41 | 18263 | 4.8% |
| 15 | 13308 | 3.5% |
| 30 | 12191 | 3.2% |
| 29 | 11065 | 2.9% |
| 50 | 10243 | 2.7% |
| 3 | 9251 | 2.4% |
| 11 | 9232 | 2.4% |
| Other values (43) | 137515 | 36.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2021 | 0.5% |
| 1 | 1008 | 0.3% |
| 2 | 4038 | 1.1% |
| 3 | 9251 | 2.4% |
| 4 | 1801 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 52 | 267 | 0.1% |
| 51 | 183 | < 0.1% |
| 50 | 10243 | 2.7% |
| 49 | 1832 | 0.5% |
| 48 | 4681 | 1.2% |
policy_sales_channel
Real number (ℝ≥0)
| Distinct | 155 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 112.0342947 |
|---|---|
| Minimum | 1 |
| Maximum | 163 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 29 |
| median | 133 |
| Q3 | 152 |
| 95-th percentile | 160 |
| Maximum | 163 |
| Range | 162 |
| Interquartile range (IQR) | 123 |
Descriptive statistics
| Standard deviation | 54.20399477 |
|---|---|
| Coefficient of variation (CV) | 0.4838160935 |
| Kurtosis | -0.9708101781 |
| Mean | 112.0342947 |
| Median Absolute Deviation (MAD) | 19 |
| Skewness | -0.9000081235 |
| Sum | 42697278 |
| Variance | 2938.07305 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 152 | 134784 | 35.4% |
| 26 | 79700 | 20.9% |
| 124 | 73995 | 19.4% |
| 160 | 21779 | 5.7% |
| 156 | 10661 | 2.8% |
| 122 | 9930 | 2.6% |
| 157 | 6684 | 1.8% |
| 154 | 5993 | 1.6% |
| 151 | 3885 | 1.0% |
| 163 | 2893 | 0.8% |
| Other values (145) | 30805 | 8.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1074 | 0.3% |
| 2 | 4 | < 0.1% |
| 3 | 523 | 0.1% |
| 4 | 509 | 0.1% |
| 6 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 163 | 2893 | 0.8% |
| 160 | 21779 | 5.7% |
| 159 | 51 | < 0.1% |
| 158 | 492 | 0.1% |
| 157 | 6684 | 1.8% |
driving_license
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 380297 | 99.8% |
| 0 | 812 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 380297 | 99.8% |
| 0 | 812 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 380297 | 99.8% |
| 0 | 812 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 380297 | 99.8% |
| 0 | 812 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 380297 | 99.8% |
| 0 | 812 | 0.2% |
vehicle_age
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 1 |
| 3rd row | 2 |
| 4th row | 0 |
| 5th row | 0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 200316 | 52.6% |
| 0 | 164786 | 43.2% |
| 2 | 16007 | 4.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 200316 | 52.6% |
| 0 | 164786 | 43.2% |
| 2 | 16007 | 4.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 200316 | 52.6% |
| 0 | 164786 | 43.2% |
| 2 | 16007 | 4.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 200316 | 52.6% |
| 0 | 164786 | 43.2% |
| 2 | 16007 | 4.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 200316 | 52.6% |
| 0 | 164786 | 43.2% |
| 2 | 16007 | 4.2% |
vehicle_damage
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 192413 | 50.5% |
| 0 | 188696 | 49.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 192413 | 50.5% |
| 0 | 188696 | 49.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 192413 | 50.5% |
| 0 | 188696 | 49.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 192413 | 50.5% |
| 0 | 188696 | 49.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 192413 | 50.5% |
| 0 | 188696 | 49.5% |
previously_insured
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206481 | 54.2% |
| 1 | 174628 | 45.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206481 | 54.2% |
| 1 | 174628 | 45.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206481 | 54.2% |
| 1 | 174628 | 45.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206481 | 54.2% |
| 1 | 174628 | 45.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 206481 | 54.2% |
| 1 | 174628 | 45.8% |
annual_premium
Real number (ℝ≥0)
| Distinct | 48838 |
|---|---|
| Distinct (%) | 12.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 30564.38958 |
|---|---|
| Minimum | 2630 |
| Maximum | 540165 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 2630 |
|---|---|
| 5-th percentile | 2630 |
| Q1 | 24405 |
| median | 31669 |
| Q3 | 39400 |
| 95-th percentile | 55176 |
| Maximum | 540165 |
| Range | 537535 |
| Interquartile range (IQR) | 14995 |
Descriptive statistics
| Standard deviation | 17213.15506 |
|---|---|
| Coefficient of variation (CV) | 0.563176798 |
| Kurtosis | 34.0045687 |
| Mean | 30564.38958 |
| Median Absolute Deviation (MAD) | 7504 |
| Skewness | 1.766087215 |
| Sum | 1.164836395 × 10^10 |
| Variance | 296292707 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2630 | 64877 | 17.0% |
| 69856 | 140 | < 0.1% |
| 39008 | 41 | < 0.1% |
| 38287 | 38 | < 0.1% |
| 45179 | 38 | < 0.1% |
| 30117 | 36 | < 0.1% |
| 43707 | 36 | < 0.1% |
| 35074 | 35 | < 0.1% |
| 36086 | 35 | < 0.1% |
| 38452 | 34 | < 0.1% |
| Other values (48828) | 315799 | 82.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2630 | 64877 | 17.0% |
| 6098 | 1 | < 0.1% |
| 7670 | 1 | < 0.1% |
| 8739 | 1 | < 0.1% |
| 9792 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 540165 | 4 | < 0.1% |
| 508073 | 1 | < 0.1% |
| 495106 | 1 | < 0.1% |
| 489663 | 1 | < 0.1% |
| 472042 | 3 | < 0.1% |
vintage
Real number (ℝ≥0)
| Distinct | 290 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 154.3473967 |
|---|---|
| Minimum | 10 |
| Maximum | 299 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 24 |
| Q1 | 82 |
| median | 154 |
| Q3 | 227 |
| 95-th percentile | 285 |
| Maximum | 299 |
| Range | 289 |
| Interquartile range (IQR) | 145 |
Descriptive statistics
| Standard deviation | 83.67130363 |
|---|---|
| Coefficient of variation (CV) | 0.5420972781 |
| Kurtosis | -1.200688042 |
| Mean | 154.3473967 |
| Median Absolute Deviation (MAD) | 73 |
| Skewness | 0.00302951689 |
| Sum | 58823182 |
| Variance | 7000.887051 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 256 | 1418 | 0.4% |
| 73 | 1410 | 0.4% |
| 282 | 1397 | 0.4% |
| 158 | 1394 | 0.4% |
| 187 | 1392 | 0.4% |
| 31 | 1388 | 0.4% |
| 160 | 1388 | 0.4% |
| 226 | 1388 | 0.4% |
| 131 | 1387 | 0.4% |
| 245 | 1387 | 0.4% |
| Other values (280) | 367160 | 96.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 1311 | 0.3% |
| 11 | 1344 | 0.4% |
| 12 | 1257 | 0.3% |
| 13 | 1329 | 0.3% |
| 14 | 1260 | 0.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 299 | 1283 | 0.3% |
| 298 | 1384 | 0.4% |
| 297 | 1284 | 0.3% |
| 296 | 1302 | 0.3% |
| 295 | 1275 | 0.3% |
response
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 381109 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334399 | 87.7% |
| 1 | 46710 | 12.3% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334399 | 87.7% |
| 1 | 46710 | 12.3% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 381109 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334399 | 87.7% |
| 1 | 46710 | 12.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 381109 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334399 | 87.7% |
| 1 | 46710 | 12.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 381109 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334399 | 87.7% |
| 1 | 46710 | 12.3% |
age_damage
Real number (ℝ≥0)
| Distinct | 66 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4340.278361 |
|---|---|
| Minimum | 2 |
| Maximum | 7324 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 1451 |
| Q1 | 2600 |
| median | 4322 |
| Q3 | 5845 |
| 95-th percentile | 7141 |
| Maximum | 7324 |
| Range | 7322 |
| Interquartile range (IQR) | 3245 |
Descriptive statistics
| Standard deviation | 1899.273687 |
|---|---|
| Coefficient of variation (CV) | 0.4375925986 |
| Kurtosis | -1.202209861 |
| Mean | 4340.278361 |
| Median Absolute Deviation (MAD) | 1559 |
| Skewness | 0.06601177171 |
| Sum | 1654119146 |
| Variance | 3607240.537 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 6659 | 25960 | 6.8% |
| 6979 | 24256 | 6.4% |
| 7141 | 20964 | 5.5% |
| 4791 | 20636 | 5.4% |
| 7324 | 16457 | 4.3% |
| 2909 | 13535 | 3.6% |
| 2408 | 10760 | 2.8% |
| 2424 | 8974 | 2.4% |
| 5845 | 8437 | 2.2% |
| 5771 | 8357 | 2.2% |
| Other values (56) | 222773 | 58.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 11 | < 0.1% |
| 7 | 11 | < 0.1% |
| 10 | 22 | < 0.1% |
| 13 | 29 | < 0.1% |
| 30 | 56 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7324 | 16457 | 4.3% |
| 7141 | 20964 | 5.5% |
| 6979 | 24256 | 6.4% |
| 6659 | 25960 | 6.8% |
| 5845 | 8437 | 2.2% |
vintage_annual_premium
Real number (ℝ≥0)
| Distinct | 302632 |
|---|---|
| Distinct (%) | 79.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 363.7404254 |
|---|---|
| Minimum | 8.795986622 |
| Maximum | 33717.28571 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 8.795986622 |
|---|---|
| 5-th percentile | 12.23255814 |
| Q1 | 117.3202847 |
| median | 194.7523364 |
| Q3 | 374.1889764 |
| 95-th percentile | 1345.682989 |
| Maximum | 33717.28571 |
| Range | 33708.48973 |
| Interquartile range (IQR) | 256.8686917 |
Descriptive statistics
| Standard deviation | 548.6531224 |
|---|---|
| Coefficient of variation (CV) | 1.508364438 |
| Kurtosis | 107.7098648 |
| Mean | 363.7404254 |
| Median Absolute Deviation (MAD) | 103.7178537 |
| Skewness | 5.741464175 |
| Sum | 138624749.8 |
| Variance | 301020.2488 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 15.20231214 | 261 | 0.1% |
| 109.5833333 | 256 | 0.1% |
| 9.460431655 | 256 | 0.1% |
| 12.00913242 | 256 | 0.1% |
| 87.66666667 | 256 | 0.1% |
| 12.11981567 | 255 | 0.1% |
| 17.53333333 | 255 | 0.1% |
| 41.74603175 | 254 | 0.1% |
| 9.392857143 | 254 | 0.1% |
| 10.31372549 | 253 | 0.1% |
| Other values (302622) | 378553 | 99.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8.795986622 | 231 | 0.1% |
| 8.825503356 | 235 | 0.1% |
| 8.855218855 | 245 | 0.1% |
| 8.885135135 | 217 | 0.1% |
| 8.915254237 | 190 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 33717.28571 | 1 | < 0.1% |
| 25948.23077 | 1 | < 0.1% |
| 25876.53846 | 1 | < 0.1% |
| 22506.875 | 1 | < 0.1% |
| 21637.5 | 1 | < 0.1% |
age_vintage
Real number (ℝ≥0)
| Distinct | 12101 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 168.5820243 |
|---|---|
| Minimum | 24.41471572 |
| Maximum | 2920 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 24.41471572 |
|---|---|
| 5-th percentile | 33.18181818 |
| Q1 | 56.47169811 |
| median | 91.8707483 |
| Q3 | 172.8947368 |
| 95-th percentile | 589.6153846 |
| Maximum | 2920 |
| Range | 2895.585284 |
| Interquartile range (IQR) | 116.4230387 |
Descriptive statistics
| Standard deviation | 230.4170123 |
|---|---|
| Coefficient of variation (CV) | 1.36679467 |
| Kurtosis | 23.01507351 |
| Mean | 168.5820243 |
| Median Absolute Deviation (MAD) | 44.77397411 |
| Skewness | 4.110853617 |
| Sum | 64248126.69 |
| Variance | 53091.99956 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 365 | 1366 | 0.4% |
| 121.6666667 | 1346 | 0.4% |
| 182.5 | 1329 | 0.3% |
| 91.25 | 1258 | 0.3% |
| 73 | 1161 | 0.3% |
| 60.83333333 | 994 | 0.3% |
| 52.14285714 | 828 | 0.2% |
| 146 | 694 | 0.2% |
| 730 | 663 | 0.2% |
| 243.3333333 | 657 | 0.2% |
| Other values (12091) | 370813 | 97.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 24.41471572 | 23 | < 0.1% |
| 24.4966443 | 26 | < 0.1% |
| 24.57912458 | 26 | < 0.1% |
| 24.66216216 | 27 | < 0.1% |
| 24.74576271 | 23 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2920 | 5 | < 0.1% |
| 2883.5 | 4 | < 0.1% |
| 2847 | 5 | < 0.1% |
| 2810.5 | 6 | < 0.1% |
| 2774 | 4 | < 0.1% |
age_damage_premium
Real number (ℝ≥0)
| Distinct | 271591 |
|---|---|
| Distinct (%) | 71.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.61946659 |
|---|---|
| Minimum | 0.3590933916 |
| Maximum | 27663 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.9 MiB |
Quantile statistics
| Minimum | 0.3590933916 |
|---|---|
| 5-th percentile | 0.5166994106 |
| Q1 | 4.186395939 |
| median | 6.994108473 |
| Q3 | 11.65038462 |
| 95-th percentile | 26.56232203 |
| Maximum | 27663 |
| Range | 27662.64091 |
| Interquartile range (IQR) | 7.463988676 |
Descriptive statistics
| Standard deviation | 117.9591069 |
|---|---|
| Coefficient of variation (CV) | 11.10781845 |
| Kurtosis | 33237.54538 |
| Mean | 10.61946659 |
| Median Absolute Deviation (MAD) | 3.456939872 |
| Skewness | 168.2687719 |
| Sum | 4047174.292 |
| Variance | 13914.35089 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3949541973 | 2629 | 0.7% |
| 0.3768448202 | 2496 | 0.7% |
| 0.5489459403 | 2250 | 0.6% |
| 0.3682957569 | 2106 | 0.6% |
| 0.9040907528 | 1852 | 0.5% |
| 0.3590933916 | 1830 | 0.5% |
| 1.092192691 | 1642 | 0.4% |
| 0.4829232464 | 1624 | 0.4% |
| 1.084983498 | 1603 | 0.4% |
| 0.4499572284 | 1549 | 0.4% |
| Other values (271581) | 361528 | 94.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3590933916 | 1830 | 0.5% |
| 0.3682957569 | 2106 | 0.6% |
| 0.3768448202 | 2496 | 0.7% |
| 0.3949541973 | 2629 | 0.7% |
| 0.4499572284 | 1549 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 27663 | 1 | < 0.1% |
| 26120.5 | 1 | < 0.1% |
| 25939.5 | 1 | < 0.1% |
| 24683.5 | 1 | < 0.1% |
| 21944.5 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
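The calculation described above can be sketched in a few lines of pure Python. This is an illustrative implementation with a hypothetical function name and toy data, not the code the report itself runs.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of covariance and standard deviations cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(xs, ys)
r_shifted = pearson_r(xs, [3 * y + 5 for y in ys])  # location and scale change
print(r, r_shifted)  # the two values agree up to rounding
```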
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
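As a sketch of that definition, the snippet below ranks both variables (assigning tied values their average rank, which is an assumption about tie handling) and then applies the Pearson formula to the ranks. Function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [-2, -1, 0, 1, 2, 3]
cubes = [x ** 3 for x in xs]  # monotonic but nonlinear in xs

print(spearman_rho(xs, cubes))  # perfect monotonic association
print(pearson_r(xs, cubes))     # lower, because the relation is not linear
```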
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
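The pair-counting definition above corresponds to the variant often called τ-a. A minimal quadratic-time sketch follows; statistical libraries commonly default to tie-adjusted variants such as τ-b, so results can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # pairs tied in either variable count as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 2, 3]))
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```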
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | gender | age | region_code | policy_sales_channel | driving_license | vehicle_age | vehicle_damage | previously_insured | annual_premium | vintage | response | age_damage | vintage_annual_premium | age_vintage | age_damage_premium |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 44 | 28.00000 | 26.00000 | 1 | 2 | 1 | 0 | 40454.00000 | 217 | 1 | 5771 | 186.42396 | 74.00922 | 7.00988 |
| 1 | 2 | 0 | 76 | 3.00000 | 26.00000 | 1 | 1 | 0 | 0 | 33536.00000 | 183 | 0 | 837 | 183.25683 | 151.58470 | 40.06691 |
| 2 | 3 | 0 | 47 | 28.00000 | 26.00000 | 1 | 2 | 1 | 0 | 38294.00000 | 27 | 1 | 5090 | 1418.29630 | 635.37037 | 7.52338 |
| 3 | 4 | 0 | 21 | 11.00000 | 152.00000 | 1 | 0 | 0 | 1 | 28619.00000 | 203 | 0 | 7324 | 140.98030 | 37.75862 | 3.90756 |
| 4 | 5 | 1 | 29 | 41.00000 | 152.00000 | 1 | 0 | 0 | 1 | 27496.00000 | 39 | 0 | 2504 | 705.02564 | 271.41026 | 10.98083 |
| 5 | 6 | 1 | 24 | 33.00000 | 160.00000 | 1 | 0 | 1 | 0 | 2630.00000 | 176 | 0 | 6659 | 14.94318 | 49.77273 | 0.39495 |
| 6 | 7 | 0 | 23 | 11.00000 | 152.00000 | 1 | 0 | 1 | 0 | 23367.00000 | 249 | 0 | 6979 | 93.84337 | 33.71486 | 3.34819 |
| 7 | 8 | 1 | 56 | 28.00000 | 26.00000 | 1 | 1 | 1 | 0 | 32031.00000 | 72 | 1 | 2763 | 444.87500 | 283.88889 | 11.59283 |
| 8 | 9 | 1 | 24 | 3.00000 | 152.00000 | 1 | 0 | 0 | 1 | 27619.00000 | 28 | 0 | 6659 | 986.39286 | 312.85714 | 4.14762 |
| 9 | 10 | 1 | 32 | 6.00000 | 152.00000 | 1 | 0 | 0 | 1 | 28771.00000 | 80 | 0 | 2560 | 359.63750 | 146.00000 | 11.23867 |
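The report does not state how the engineered columns were built. The sample rows are, however, numerically consistent with the following guesses: vintage_annual_premium = annual_premium / vintage, age_vintage = age × 365 / vintage, and age_damage_premium = annual_premium / age_damage. The sketch below checks these assumed formulas against the first three sample rows; the formulas and the helper name are inferences, not documented definitions.

```python
# Each tuple: age, annual_premium, vintage, age_damage, followed by the
# reported vintage_annual_premium, age_vintage, age_damage_premium.
rows = [
    (44, 40454.0, 217, 5771, 186.42396, 74.00922, 7.00988),
    (76, 33536.0, 183, 837, 183.25683, 151.58470, 40.06691),
    (47, 38294.0, 27, 5090, 1418.29630, 635.37037, 7.52338),
]


def derive(age, annual_premium, vintage, age_damage):
    """Assumed formulas for the three ratio features (an inference)."""
    return (
        annual_premium / vintage,     # vintage_annual_premium
        age * 365 / vintage,          # age_vintage
        annual_premium / age_damage,  # age_damage_premium
    )


for age, premium, vintage, damage, *reported in rows:
    derived = derive(age, premium, vintage, damage)
    ok = all(abs(d - r) < 1e-5 for d, r in zip(derived, reported))
    print(age, [round(d, 5) for d in derived], "matches" if ok else "differs")
```

If these formulas hold, small vintage values inflate the ratios sharply, which would help explain the long right tails reported for these three columns.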
Last rows
| | id | gender | age | region_code | policy_sales_channel | driving_license | vehicle_age | vehicle_damage | previously_insured | annual_premium | vintage | response | age_damage | vintage_annual_premium | age_vintage | age_damage_premium |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 381099 | 381100 | 1 | 51 | 28.00000 | 26.00000 | 1 | 1 | 1 | 0 | 44504.00000 | 71 | 0 | 4077 | 626.81690 | 262.18310 | 10.91587 |
| 381100 | 381101 | 1 | 29 | 28.00000 | 124.00000 | 1 | 0 | 1 | 0 | 49007.00000 | 137 | 0 | 2504 | 357.71533 | 77.26277 | 19.57149 |
| 381101 | 381102 | 1 | 70 | 28.00000 | 122.00000 | 1 | 2 | 1 | 0 | 50904.00000 | 215 | 0 | 1398 | 236.76279 | 118.83721 | 36.41202 |
| 381102 | 381103 | 1 | 25 | 41.00000 | 152.00000 | 1 | 0 | 1 | 1 | 2630.00000 | 102 | 0 | 4791 | 25.78431 | 89.46078 | 0.54895 |
| 381103 | 381104 | 0 | 47 | 50.00000 | 26.00000 | 1 | 1 | 1 | 0 | 39831.00000 | 235 | 0 | 5090 | 169.49362 | 73.00000 | 7.82534 |
| 381104 | 381105 | 0 | 74 | 26.00000 | 26.00000 | 1 | 1 | 0 | 1 | 30170.00000 | 88 | 0 | 1084 | 342.84091 | 306.93182 | 27.83210 |
| 381105 | 381106 | 0 | 30 | 37.00000 | 152.00000 | 1 | 0 | 0 | 1 | 40016.00000 | 131 | 0 | 2511 | 305.46565 | 83.58779 | 15.93628 |
| 381106 | 381107 | 0 | 21 | 30.00000 | 160.00000 | 1 | 0 | 0 | 1 | 35118.00000 | 161 | 0 | 7324 | 218.12422 | 47.60870 | 4.79492 |
| 381107 | 381108 | 1 | 68 | 14.00000 | 124.00000 | 1 | 2 | 1 | 0 | 44617.00000 | 74 | 0 | 1451 | 602.93243 | 335.40541 | 30.74914 |
| 381108 | 381109 | 0 | 46 | 29.00000 | 26.00000 | 1 | 1 | 0 | 0 | 41777.00000 | 237 | 0 | 5471 | 176.27426 | 70.84388 | 7.63608 |